Selective multi-path acoustic model based on database likelihoods

Authors

Akinobu Lee

Yuichiro Mera

Hiroshi Saruwatari

Kiyohiro Shikano

Abstract

An efficient multi-path acoustic model based on database likelihoods for spontaneous speech recognition is presented. Although a multi-path phone HMM that has several models for different targets in parallel is considered effective for expressing the multi-style or speed-variant nature of spontaneous speech, allowing various models to match at every time for all phones may cause unintended models to be matched and spoil the model constraints. We propose defining a multi-path model that has several different state resolutions only for the distortive phones, chosen selectively. The phone set is selected through an analysis of the likelihoods and duration times of phone segments in a spoken dialogue corpus using automatic Viterbi alignment. Experiments on three test sets showed that our multi-path model based on the phone selection can achieve better accuracy than a simple single-path model, whereas a full multi-path model without phone selection causes considerable degradation of accuracy.
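The phone selection step can be illustrated with a minimal sketch. It assumes each aligned phone segment is available as a `(phone, log_likelihood, n_frames)` tuple and flags a phone when its mean per-frame log-likelihood is low or its mean duration is short. The function name, the two thresholds, and the either-criterion rule are hypothetical choices for illustration; the abstract does not state the actual selection criteria.

```python
from collections import defaultdict
from statistics import mean


def select_distortive_phones(segments, ll_threshold, min_mean_frames):
    """Pick phones whose aligned segments look poorly modeled.

    segments: iterable of (phone, log_likelihood, n_frames) tuples,
        assumed to come from a Viterbi forced alignment.
    ll_threshold: hypothetical cutoff on the mean per-frame log-likelihood.
    min_mean_frames: hypothetical cutoff on the mean segment duration.
    """
    per_frame_ll = defaultdict(list)
    durations = defaultdict(list)

    for phone, log_likelihood, n_frames in segments:
        if n_frames <= 0:
            continue  # skip empty or malformed segments
        # Normalize by duration so long and short segments are comparable.
        per_frame_ll[phone].append(log_likelihood / n_frames)
        durations[phone].append(n_frames)

    selected = set()
    for phone, values in per_frame_ll.items():
        low_likelihood = mean(values) < ll_threshold
        short_duration = mean(durations[phone]) < min_mean_frames
        # Assumption: either criterion alone marks the phone as distortive.
        if low_likelihood or short_duration:
            selected.add(phone)
    return selected


if __name__ == "__main__":
    demo = [("a", -30.0, 10), ("a", -50.0, 10), ("q", -40.0, 4), ("s", -20.0, 10)]
    print(select_distortive_phones(demo, ll_threshold=-6.0, min_mean_frames=3))
```

Under this sketch, only the phones returned by the function would receive the additional parallel paths, while all other phones keep a single-path model.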

Related articles

Implementation of an adaptive burst DQPSK receiver over shallow water acoustic channel

In an environment such as an underwater channel, where placing test equipment is difficult, it is much more practical to have hardware simulators to examine suitably designed transceivers (transmitter/receiver). Simulators of this kind allow researchers to observe their intentions and carry out repetitive tests to find suitable digital coding/decoding algorithms. In this p...

A Simulation Study of Multi-path Characteristics of Acoustic Propagation in the Strait of Hormuz

Multi-path interference due to boundary reflection in shallow water acoustic communication poses a major obstacle to reliable and high-speed underwater communication systems. In this study, 3D variations of field data such as sound speed, temperature and salinity in horizontal transects of the Strait of Hormuz were initially analyzed using the ROPME data. Later, data on typical sound speed ...

Frame level likelihood transformations for ASR and utterance verification

In most of the current speech recognition systems based on HMM, existing decoding and utterance verification methods make use of state output likelihood as a measure of the acoustic match between the input data and the acoustic models. In this paper, we present a new and more generalized approach to the formation of the acoustic match score. The essence of this approach is to transform the likel...

Unconstrained versus constrained acoustic normalisation in confidence scoring

In HMM-based recognition systems for large vocabulary, the observation likelihoods provided by the acoustic models are useful in confidence measures if they are properly normalised. This paper compares two normalisation methods for the acoustic model likelihoods: unconstrained normalisation, based on the unconditional observation likelihood, and constrained normalisation, based on the observati...
